Summary



Examining district guidance to 
schools on teacher evaluation 
policies in the Midwest Region 



This descriptive study provides a snap- 
shot of teacher evaluation policies across 
a demographically diverse sample of dis- 
tricts in the Midwest Region. It aims to lay 
the groundwork for further research and 
inform conversations about current poli- 
cies at the local, district, and state levels. 

Effective teaching is a cornerstone of educa- 
tion reform (Whitehurst, 2002) and is critical 
for student academic achievement. But teach- 
ers’ abilities to promote student learning vary 
within and across schools (Aaronson, Barrow, 
& Sander, 2003; Nye, Konstantopoulos, &
Hedges, 2004; Rivkin, Hanushek, & Kain, 2005; 
Rockoff, 2004). Research finds that an impor- 
tant tool for improving teacher effectiveness is 
the teacher evaluation (Danielson & McGreal, 
2000; Howard & McColskey, 2001; Shinkfield 
& Stufflebeam, 1995; Stronge, 1995). Federal
highly qualified teacher requirements have led 
to a surge of state and local education agencies 
developing new systems to evaluate teachers. 

But studies of evaluation policies and their
influence on teacher practice are scarce
(Peterson, 2000), and the few that exist are
usually descriptive and outdated and leave many
questions unanswered. For example,

• What does the landscape of teacher evalu- 
ation policy at the district level look like 
today? 

• What can be learned about the policy pro- 
cess by examining district documents? 

This study — which tries to answer these two 
questions — is the first systematic effort to 
describe evaluation policies across a demo- 
graphically diverse sample of districts in the 
Midwest Region. School district policy for 
evaluating teachers varies widely across the 
region — both in the evaluation practices speci- 
fied in the policy documents and in the details 
of the policy prescriptions. 

This study examines district evaluation policy 
documents for evidence of 13 common teacher 
evaluation practices (Ellett & Garland, 1987; 
Loup, Garland, Ellett, & Rugutt, 1996). In
general, district policy documents were more
apt to specify the processes involved in teacher
evaluation (who conducts the evaluation, when,
and how often) than they were to provide 
guidance for the content of the evaluation, the 
standards by which the evaluation would be 
conducted, or the use of the evaluation results. 
District policies also varied in how specific 
they were, though the tendency was to be less, 
rather than more, specific for the 13 evaluation 
practices examined. Two-thirds of the district 
teacher evaluation policy documents provided 
guidance for fewer than half of the 13 practices. 
No policies specified more than 10 evaluation 
practices, and nearly 16 percent reflected none 
of these practices. The most commonly refer- 
enced practice was how often evaluations are 
to be conducted (67 percent), followed by what 
evaluation tools are to be used (59 percent) and 
what methods are to be used (49 percent). 

The study also finds that Midwest Region dis- 
tricts evaluate teachers primarily to help de- 
cide whether to retain or release new teachers. 
School principals and administrators do most 
of the evaluations and, at the district’s direc- 
tion, focus on beginning teachers. Beginning 
teachers are typically evaluated two or more 
times a year, and experienced teachers just 
once every two or three years. Several other 
patterns emerge from the findings: 

• Many district policies distinguish between 
beginning and experienced teachers. 



• Few policies spell out consequences for 
unsatisfactory evaluations. 

• Few districts reference using resources or 
guidance to support evaluations. 

• Most evaluations are summative reports 
used to support decisions about retaining 
teachers and granting tenure, rather than 
for professional development. 

• Few district policies require evaluators to 
be trained. 

• Vague terminology leaves evaluation poli- 
cies open to interpretation. 

• The specificity of policy and procedures 
varies across districts. 

The report’s findings lay the groundwork for
additional research, identifying several
questions that need further investigation:

• What is the role of state departments 
of education in the teacher evaluation 
process? 

• How do policy variations affect teacher 
evaluation at the local level? 

• What is the influence of district policy in 
evaluating beginning teachers, tenured 
teachers, and unsatisfactory teachers? 

• What is the impact of different evalu- 
ation models and practices on teacher 
effectiveness? 

November 2007 



WHY THIS REPORT? 

Effective teaching is a cornerstone of education reform (Whitehurst, 2002) and
is critical for student academic achievement. But teachers’ abilities to
promote student learning vary both within and across schools (Aaronson,
Barrow, & Sander, 2003; Nye, Konstantopoulos, & Hedges, 2004; Rivkin,
Hanushek, & Kain, 2005; Rockoff, 2004). Research finds that an important
element for improving teacher effectiveness is the teacher evaluation
(Danielson & McGreal, 2000; Howard & McColskey, 2001; Shinkfield &
Stufflebeam, 1995; Stronge, 1995). Federal highly qualified teacher
requirements have led to a surge of state and local education agencies
developing new systems to evaluate teachers.

But while new evaluation systems emerge, studies of evaluation policies and
their influence on teacher practice remain scarce (Peterson, 2000), and the
few that exist are usually descriptive and outdated and leave many questions
unanswered. For example,

• What does the landscape of teacher evaluation 
policy look like today? 

• What can be learned about the policy process 
by examining district documents? 

This study — which tries to answer these two 
questions — is the first systematic effort to describe 
evaluation policies across a demographically 
diverse sample of districts in the Midwest Region. 
School district policy for evaluating teachers varies 
widely across the region — both in the evaluation 
practices specified in the policy documents and in 
the details of the policy prescriptions. 

This study examines district evaluation policy documents for evidence of 13
common teacher evaluation practices (Ellett & Garland, 1987; Loup, Garland,
Ellett, & Rugutt, 1996). In general, district policy documents were more apt
to specify the processes involved in teacher evaluation (who conducts the
evaluation, when, and how often) than they were to provide guidance for the
content of the evaluation, the standards by which the evaluation would be
conducted, or the use of the evaluation results. District policies also varied
in how specific they were, though the tendency was to be less, rather than
more, specific for the 13 evaluation practices examined. Fully two-thirds of
the district teacher evaluation policy documents provided guidance for fewer
than half of the 13 practices. No policies specified more than 10 evaluation
practices, and nearly 16 percent of the districts’ evaluation policies
reflected none of these practices. The most commonly referenced practice was
how often evaluations are to be conducted (67 percent), followed by what
evaluation tools are to be used (59 percent) and what methods are to be used
(49 percent).



The study also finds that Midwestern districts 
evaluate teachers primarily to help decide 
whether to retain or release new teachers. School 
principals and administrators do most of the 
evaluation and, at the district’s direction, focus 
on beginning teachers. Beginning teachers are
typically evaluated two or more times a year,
while experienced teachers are evaluated just once every
two or three years. Several other patterns emerge from
the findings: 

• Many district policies distinguish between 
beginning and experienced teachers. 

• Few policies spell out consequences for unsat- 
isfactory evaluations. 

• Few districts reference using resources or 
guidance to support evaluations. 

• Most evaluations are summative reports used 
to support decisions about retaining teachers 
and granting tenure, rather than for profes- 
sional development. 

• Few district policies specify training for 
evaluators. 

• Vague terminology leaves evaluation policies 
open to interpretation. 

• The specificity of policy and procedures varies 
across districts. 



These findings lay the groundwork for additional research, identifying several
questions that need further investigation:

• What is the role of state departments of education in the teacher
evaluation process?

• How do policy variations affect teacher evaluation at the local level?

• What is the influence of district policy in evaluating beginning teachers,
tenured teachers, and unsatisfactory teachers?

• What is the impact of different evaluation models and practices on teacher
effectiveness?



THE LITERATURE ON EVALUATION POLICIES

Only two studies have examined teacher evaluation policies on a large scale
(Ellett & Garland, 1987; Loup et al., 1996). Both used the Teacher Evaluation
Practices Survey (TEPS) (Ellett & Garland, 1987) of superintendents to collect
information about teacher evaluation policies and procedures in the nation’s
100 largest school districts. The TEPS is divided into three sections: teacher
evaluation policies, purposes, and practices. The policy section asks how
teachers are informed of the policy, who will be evaluated, how often, and
what the standards for acceptable teaching are. The purpose section asks about
the primary purposes for conducting evaluations and how districts use
evaluation results, whether for personnel decisions or professional
development. The practices section examines who conducts the evaluation, what
evaluation methods are used, and what training is required for evaluators.

Ellett and Garland’s survey (1987) of superintendents and their analysis of
district policy documents suggested that teacher evaluations were most often
used for dismissal or remediation rather than for professional development,
district policies rarely established performance standards or evaluator
training, few districts permitted external or peer evaluations, and
superintendents tended to present their district policies more favorably than
did independent reviewers.



A decade later Loup et al. (1996) conducted a fol- 
low-up study, adapting Ellett and Garland’s survey 
to measure superintendents’ opinions about the 
effectiveness of their teacher evaluation systems. 
The results mirrored those of Ellett and Garland — 
teacher evaluation policies in large districts had 
changed little in 10 years. But superintendents 
were no longer satisfied with the status quo, and
many reported a need to revise their existing 
evaluation tools and procedures. 

The following sections use the TEPS framework 
to review other relevant research and professional 
guidance on teacher evaluation. 



Evaluation policies and purposes 

According to Loup et al. (1996), most policies 
give supervisors and assistant superintendents 
the district-level responsibility for evaluating 
teachers. Principals and assistant principals are 
most often responsible at the building level. The 
conflicting roles of principals who must serve both 
as instructional leaders and as evaluators are the 
biggest problem with teacher evaluation accord- 
ing to Wise, Darling-Hammond, McLaughlin, 
and Bernstein (1984). Many principals, wanting to 
maintain collegial relations with staff members, 
are reluctant to criticize them. It is not surprising, 
then, that Johnson (1990) reports that many teach- 
ers believe that administrators are not competent 
evaluators. Several studies “depict principals as 
inaccurate raters both of individual teacher per- 
formance behaviors and of overall teacher merit” 
(Peterson, 1995, p. 15). According to Dwyer and
Stufflebeam (1996), principals usually rate all their
teachers as performing at acceptable levels, a
welcome but unlikely result.

How often teachers are evaluated usually depends 
on their years of experience. Sweeney and Manatt 
(1986) report that most districts observe nonten- 
ured teachers more frequently than tenured teach- 
ers. Guidance on teacher evaluation, however, rec- 
ommends supervising beginning teachers rather 
than evaluating them (Nolan & Hoover, 2005). 

The frequency of evaluation may also be related to 
what assessment tool a district uses. Some districts 
may require observations more frequently than 
portfolio assessments. Others may use alternative 
tools, such as the interview protocol and scoring
rubric developed by Flowers and Hancock (2003).
This tool is designed to gauge, accurately and quickly,
a teacher’s ability to assess and modify
instruction to improve student achievement.



Researchers have encouraged districts to modify 
their practices in ways that increase the likelihood 
of evaluation informing teachers’ improvement 
efforts, but there has not been strong evidence that 
this advice has had its intended effect. Researchers 
and teachers alike often assume that the intended 
effect is to inform professional development and 
recertification (Clark, 1993). 






But although districts with performance-based compensation programs use
evaluations to determine salary increases (Schacter, Thum, Reifsneider, &
Schiff, 2004), few use them to improve teacher practices. Researchers have
suggested that successful evaluation depends on clear communication between
administrators and teachers (Darling-Hammond, Wise, & Pease, 1983; Stronge,
1997). To facilitate this, researchers have advocated involving teachers in
the design and implementation of the evaluation process (Kyriakides,
Demetriou, & Charalambous, 2006). But case studies show that policies and
procedures are not regularly communicated to teachers; teachers are thus more
likely to see the evaluation as a summative report generated to meet district
or state requirements (Zepeda, 2002) than as an opportunity to gauge and
improve their teaching capacity.



Evaluation practices 

Stronge (2002) holds that measures of teacher 
planning — such as unit and lesson plans, student 
work samples, and analyses of student work — are 
related to student success and should be evalu- 
ated. Quality lesson plans should link learning 
objectives with activities, keep students’ atten- 
tion, align objectives with the district and state 
standards, and accommodate students with 
special needs (Stronge, 2002). And for the lesson 
to be effectively delivered, teachers must be able
to manage the classroom. One suggested way to 
manage the classroom learning environment is 
to establish routines that support quality interac- 
tions between teachers and students and between 
students and their peers (Ellett & Garland, 1987; 
Loup et al., 1996). 

Teachers often have more than one role, which 
can require “technological, administrative, and 
social skills in addition to those needed for routine 
teaching roles” (Rosenblatt, 2001, p. 685). Some 
districts may thus also evaluate teacher contribu- 
tions to committees, commitment to professional 
development, and communication with colleagues
and parents (Ellett & Garland, 1987; Loup
et al., 1996).

Studies find that districts use a variety of methods to evaluate teachers:
observations (Muijs, 2006), teacher work samples (Schalock, 1998), interview
protocols (Flowers & Hancock, 2003), and portfolio assessments (Gellman, 1992)
are just a few. The portfolio assessment has garnered particular attention
from administrators, teachers, and researchers. Teachers and administrators
find portfolios useful tools for helping teachers self-reflect, identify
strengths and weaknesses, and grow professionally (Tucker, Stronge, & Gareis,
2002). Researchers are concerned, however, whether portfolios accurately
reflect what occurs in classrooms (Tucker et al., 2002).






Whether the evaluation method should depend on a teacher’s years of experience
is debated. Some researchers advocate using one evaluation system for all
teachers (Danielson & McGreal, 2000; McLaughlin & Pfeifer, 1988; Stronge,
1997), while others argue that using different evaluation systems for
beginning and experienced teachers is more valuable (Millman & Schalock, 1997;
Stiggins & Duke, 1988; Beerens, 2000; Danielson & McGreal, 2000). Peterson
(2000), for example, argues that new teacher evaluations should provide
opportunities for professional development in areas the teacher’s preparation
program did not address.

Several case studies suggest that evaluators need
to be trained to properly assess teacher behaviors
and characteristics (Wise et al., 1984; Stiggins &
Duke, 1988). Large-scale studies on teacher evaluation
policies have, however, found little evidence
of comprehensive evaluation training programs
(Loup et al., 1996).



ADDRESSING QUESTIONS LEFT 
UNANSWERED BY THE LITERATURE 

As the literature review makes evident, research 
on teacher evaluation policies is usually descrip- 
tive and often outdated. Professional guidance 
may be more abundant, but rarely references its 
research base. Many questions about district policies
are left unanswered. For example,

• What does the landscape of teacher evaluation 
policy look like today? 

• What can be learned about the policy process 
by examining district documents? 

This study begins to answer these questions— 
and to identify where future research may be 
warranted — by looking at Midwest Region district 
policies on teacher evaluation. The findings will 
deepen the field’s understanding of evaluation 
policies and inform administrators’ conversations 
about the state of current policies. 

The study selected a sample of 216 districts demo- 
graphically representative of districts across the 
Midwest Region; 140 participated in the study and 
provided researchers with their teacher evalu- 
ation policies and procedures. The policies and 
procedures were coded according to 13 research 
questions adapted from the Teacher Evaluation 
Practices Survey (Ellett & Garland, 1987; Loup
et al., 1996). (Box 1 summarizes the data sources 
and methodology, appendix A further discusses 
sample selection, and appendix B provides de- 
tailed findings.) 
BOX 1

Data and methodology 

This study used a stratified sample 
design to select school districts that 
are demographically representative 
of the majority of districts across the 
Midwest Region. A data file contain- 
ing 3,126 public school districts in 
Illinois, Indiana, Iowa, Michigan, 
Minnesota, Ohio, and Wisconsin was 
generated using the National Center 
for Education Statistics Common 
Core of Data (U.S. Department of 
Education, 2004/05). 

Sample districts were selected ac- 
cording to locale (urban, suburban, 
rural), the percentage of students 
in the district eligible for free or 
reduced-price lunch (less than or 
more than 40 percent of eligible stu- 
dents), and the percentage of minor- 
ity student enrollment in the district 
(less than or more than 40 percent 
non-Caucasian minority students) (see note 1).

In all, 216 districts were selected. 
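
The stratification described above can be sketched in code. This is only an illustration, not the authors' procedure: the `District` record, the `stratum` and `draw_sample` helpers, the equal `per_stratum` allocation, and the treatment of a value of exactly 40 percent are all assumptions, since the box does not report how many districts were drawn from each stratum or how boundary cases were handled.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class District(NamedTuple):
    """Hypothetical record for one district in the sampling frame."""
    name: str
    locale: str          # "urban", "suburban", or "rural"
    pct_frl: float       # percent eligible for free or reduced-price lunch
    pct_minority: float  # percent minority enrollment


def stratum(d: District, cutoff: float = 40.0) -> tuple:
    """Map a district to its (locale, lunch band, minority band) stratum.

    Assumption: exactly 40 percent is placed in the "high" band.
    """
    lunch = "high_frl" if d.pct_frl >= cutoff else "low_frl"
    minority = "high_minority" if d.pct_minority >= cutoff else "low_minority"
    return (d.locale, lunch, minority)


def draw_sample(districts, per_stratum, seed=0):
    """Draw up to `per_stratum` districts at random from each stratum."""
    groups = defaultdict(list)
    for d in districts:
        groups[stratum(d)].append(d)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = []
    for key in sorted(groups):
        members = groups[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample
```

Three locales crossed with two lunch bands and two minority bands give at most 12 strata. Sampling within each stratum separately is what allows small groups, such as urban districts, to be represented.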

Methodology 

Sample districts were asked to provide their teacher evaluation policies and
procedures. These data were first coded according to 13 research questions
adapted from the Teacher Evaluation Practices Survey (TEPS) (Ellett & Garland,
1987; Loup et al., 1996) and then according to the common and unique policies
and procedures reported by each district. Summary reports were created for
each research question, using counts of codes and representative quotations
from the documents to clarify the findings and identify areas for further
inquiry.

The data were analyzed using the SPSS Complex Samples add-on module, which
calculates common statistics, such as the standard errors reported in this
paper, based on the parameters of the plan used to select the representative
sample. Because different groups had different sampling probabilities in the
sample selection process, these weights were also incorporated into the data
analysis by the Complex Samples add-on.

Research questions

Evaluation standards and criteria

• What specific criteria are to be evaluated?

• What external resources were used to inform evaluations?*

• What training is required of evaluators?

• Do evaluation policies differ by content area and/or special populations?*

Evaluation processes

• How often are evaluations to be conducted?

• What evaluation tools are to be used?

• What methods of evaluation are suggested or required?

• Who has responsibility for conducting the teacher evaluation?

• When are evaluations conducted?*

• How is the teacher evaluation policy to be communicated to teachers?

• What formal grievance procedures are to be in place for teachers?

Evaluation results

• How are teacher evaluation results to be used?

• How are teacher evaluation results to be reported?

Notes

* These questions were added to the TEPS categories to capture additional
details about evaluation requirements.

1. District size was initially included in the selection strata, but was
dropped because size was strongly and positively correlated with district
locale.



TEACHER EVALUATION POLICY
IN THE MIDWEST REGION

District policy for evaluating teachers varies widely across the region, both
in the evaluation practices specified in the policy documents and in the
details of the policy prescriptions. Still, patterns in the data reflect some
commonalities across the region. The 13 evaluation practices were grouped into
three categories: evaluation standards and criteria, evaluation processes, and
evaluation results (table 1). It is clear that district policy documents
frequently referenced evaluation processes (such as who conducts the
evaluation, when, and how often). Of the 13 evaluation practices examined, the
three most commonly referenced all pertained to process: how often evaluations
are to be conducted (67 percent), which tools (checklist, rating scale) are to
be used (59 percent), and what methods (observation, portfolio, professional
development plans) are to be used (49 percent).

Practices pertaining to the standards and criteria that inform the evaluations
are less frequently referenced. Fewer than two in five districts (39 percent)
provided details in their evaluation policy about the actual criteria used to
rate teacher practice. One in five districts (21 percent) indicated that they
used external resources, such as evaluation models or standards frameworks, in
the design of their system. Only 8 percent of all district policies referenced
any form of training or certification criteria for raters, and a mere
5 percent of all districts recommended different policies based on differences
in subject area or student population.

Between these extremes (the more frequent reference to evaluation processes
and the less frequent reference to evaluation standards and criteria) was a
modest mention of how to use evaluation results. Nearly half the district
policies detailed how the evaluation results were to be used (48 percent),
although fewer than one-third specified how these results were to be reported
to teachers (31 percent).

The extent to which the 13 evaluation practices were addressed in district
policy varied. Two-thirds of district policies provided guidance for fewer
than half of the 13 practices. No district policies specified more than 10
evaluation practices, and nearly 16 percent of them reflected none of these
practices.

Independent samples chi-square tests were performed to determine whether urban
districts were more likely than rural or suburban districts to address any one
of the 13 evaluation practices. Compared with rural districts, urban district
policies were significantly more likely to differentiate evaluation practice
for special population or content area teachers (z = -3.21, p < .05,
two-tailed) and to specify when the evaluations should be conducted
(z = -2.44, p < .05, two-tailed). No statistically significant differences
were found between the urban and suburban districts.



TABLE 1

Percentage of districts specifying evaluation practices within teacher evaluation policy documents

                                                                                        Locale
Teacher evaluation practices                                                Total       Rural       Suburban    Urban
                                                                            (n=140)     (n=61)      (n=67)      (n=12)

Evaluation standards and criteria
What specific criteria are to be evaluated?                                 39 (6.1)    33 (8.4)    37 (9.3)    45 (21.2)
What external resources were used to inform evaluations?                    21 (3.1)    21 (4.3)    18 (4.5)    23 (12.6)
What training is required of evaluators?                                    8 (2.1)     10 (2.7)    6 (3.5)     6 (6.6)
Do evaluation policies differ by content area and/or special populations?   5 (4.2)     0* (0)      8 (3.9)     41 (24.6)

Evaluation processes
How often are evaluations to be conducted?                                  67 (4.3)    68 (5.6)    61 (6.5)    78 (12.1)
What evaluation tools are to be used?                                       59 (4.5)    60 (7.1)    54 (5.4)    81 (13.3)
What methods of evaluation are suggested or required?                       49 (5.5)    50 (8.2)    42 (7.6)    74 (10.4)
Who has responsibility for conducting the teacher evaluation?               41 (5.6)    40 (10.6)   37 (4.8)    68 (14.6)
When are evaluations conducted?                                             36 (5.1)    28* (6.6)   38 (7.3)    75 (15.4)
How is the teacher evaluation policy to be communicated to teachers?        32 (5.7)    31 (8.5)    27 (8.1)    39 (19.9)
What formal grievance procedures are to be in place for teachers?           31 (4.4)    34 (7.4)    22 (4.3)    39 (17.8)

Evaluation results
How are teacher evaluation results to be used?                              48 (6.6)    52 (8.4)    39 (10.7)   43 (22.2)
How are teacher evaluation results to be reported?                          31 (6.9)    27 (11.4)   31 (10.3)   30 (14)

* indicates a difference between rural and urban or between suburban and urban that was statistically significant using a two-tailed test at the 95 percent confidence level.

Note: Numbers in parentheses are standard errors. Independent samples chi-square tests were performed to determine whether urban districts were more
likely than rural or suburban districts to address any one of the 13 evaluation practices.

Source: Authors' analysis based on data described in box 1 and appendix A.



COMMON PATTERNS IN TEACHER EVALUATION
POLICIES IN THE MIDWEST REGION

The collected district policies present a somewhat uneven picture of teacher
evaluation in the sampled Midwest Region districts, but they contain many
interesting leads for further research. The findings in this section pertain
only to the sample and not to the region as a whole, however, because they are
based on small numbers of districts that referenced a given practice within
their policy documents. Districts in the sample view teacher evaluation mainly
as a tool for making decisions about retaining or releasing new teachers.
School principals and administrators do most of the evaluations and, at the
district’s direction, focus on beginning teachers. Beginning teachers are
typically evaluated at least two times a year; experienced teachers are
usually evaluated once every two or three years. Several other patterns emerge
and are explored in the following sections.



Many district policies distinguish
between beginning and
experienced teachers

Districts that articulated when and how often teacher evaluations should be
conducted frequently distinguished between teachers with varying experience.
Of the 94 districts that included policy or procedural statements to guide the
frequency of evaluations, 88 percent (standard error = 3.2 percent) specified
how often beginning teachers should be evaluated. Specifically, 63 percent
(standard error = 5.9 percent) of districts required beginning teachers be
evaluated at least two to three times a year, usually by a combination of
informal observations, formative evaluations, and formal summative
evaluations.



Fewer than half the policies detail
how results are to be used

Fewer than half the districts spelled out how 
the teacher evaluation results would be used (48 
percent, standard error = 6.6 percent) or outlined 
procedures for filing official grievances (31 per- 
cent, standard error = 4.4 percent). When details 
were provided, policy documents typically noted 
that evaluations would be used to make decisions 
about retaining probationary staff or dismissing 
nonprobationary staff. 
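
The percentages in this section are paired with standard errors that, according to box 1, come from a design-based analysis in SPSS Complex Samples. The sketch below shows one textbook way to estimate a proportion and its standard error from a stratified sample. It is an assumption-laden illustration, not the SPSS procedure: the function name `stratified_proportion`, the finite population correction, and the example counts are all hypothetical and are not taken from the study data.

```python
import math


def stratified_proportion(strata):
    """Estimate a proportion and its standard error from a stratified sample.

    `strata` is a list of (population_size, sample_size, count_yes) tuples,
    one per stratum. All inputs used below are hypothetical.
    """
    total = sum(pop for pop, _, _ in strata)
    estimate = 0.0
    variance = 0.0
    for pop, n, yes in strata:
        weight = pop / total      # stratum share of the population
        p = yes / n               # sample proportion within the stratum
        estimate += weight * p
        if n > 1:
            fpc = 1.0 - n / pop   # finite population correction
            variance += weight ** 2 * fpc * p * (1.0 - p) / (n - 1)
    return estimate, math.sqrt(variance)


# Hypothetical example: two strata sampled at different rates.
est, se = stratified_proportion([(100, 10, 5), (300, 10, 2)])
print(f"{est:.3f} (standard error = {se:.3f})")
```

Weighting each stratum by its population share is what keeps an oversampled group from dominating the estimate. This matters when sampling probabilities differ across groups, as box 1 says they did.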



Few districts referenced using resources 
or guidance to support evaluations 

Only 29 of 140 districts (21 percent, standard 
error = 3.1 percent) provided documentation about 
the resources and guidance they used to inform 
the teacher evaluation process. While this seems 
a surprisingly low number, participating districts 
may leave it up to administrators and supervisors 
to find and use supporting resources. Districts
may also provide administrators with resources 
and guidance that are not mentioned in the pro- 
cedures submitted to the researchers. A follow-up 
study could further investigate — through inter- 
views or surveys of district and union personnel — 
what resources and guidance are most widely 
used. Research is also needed to determine how 
different models of teacher evaluation affect 
teacher effectiveness and student achievement. 



Most evaluations are used for summative reports 
rather than for professional development 

Of the 43 districts whose policies specified how 
evaluation results should be reported, 34 (79 
percent, standard error = 6.2 percent) specified 
that results should be reported in summative form. 
Milanowski’s (2005) quasi-experimental study 
suggests that the type of evaluation (formative or 
summative) is less important to professional devel- 
opment than the quality of developmental support, 
the credibility and accessibility of a mentor, and 
the personal compatibility of the evaluator with the 
person being evaluated — especially for beginning 
teachers. Only 45 percent (standard error = 6.4 
percent) of districts whose policies provided for 
how evaluation results would be used, however, 
stated that the results would inform professional 
development, intensive assistance, or remediation 
for teachers receiving unsatisfactory evaluations. 



Few district policies require evaluators to be trained

Only 8 percent (standard error = 2.1 percent)
of district policies included information about
required training for evaluators. Two percent
(standard error = 1.6 percent) specified only that
evaluators must have obtained appropriate
certification or licensure to supervise instruction.
The remaining 6 percent (standard error = 2 percent)
specified that, in addition to administrative
certification, evaluators must participate in
state-sponsored or other training to evaluate
teachers. More research is needed to examine the
link between effective training and evaluation. A
systematic descriptive study examining the components
of current evaluator training requirements may
inform future experimental research.



Vague terminology leaves evaluation
policies open to interpretation

The definition of evaluation varied across districts.
Even when districts used more targeted terms
(formative evaluation, summative evaluation, and
formal and informal evaluation, to name a few), it
was often difficult to understand from the documents
what the districts actually intended. The term
“observation” further complicates understanding
the evaluation process. Some districts
described pre-observation and post-observation
conferences. Others simply mentioned that
observations were to take place, without describing
what those observations might include. A few
provided a detailed observation and evaluation
process that left no question about how the process
was to be conducted. Ellett and Garland (1987)
and Loup et al. (1996) did not report this problem
of varied definitions because their primary data
source was a survey. But looking at the policies
revealed an interesting pattern of vague terminology.
Further study about using precise language in
evaluation policies could be useful. For instance,
do certain types of districts intentionally use
vague language intended for broad interpretation?
When, and under what conditions?



Further research is also needed to determine the 
extent to which districts and schools provide other 
opportunities for evaluation that are not formally 
stated in policy but that are expected or encour- 
aged. Examples might include peer reviews, criti- 
cal friends, action research, and self-evaluation. 

A standard survey with a full round of follow-up 
promptings could enable a comparison of what is 
contained in districts’ written policy and proce- 
dures and what actually occurs in the district. A 
survey of principals and teachers could provide a 
better understanding of how effectively the for- 
mally stated district policies are implemented. 
APPENDIX A

METHODOLOGY



Sampling procedure

This study used a stratified sample design to select
school districts that demographically represented
the majority of districts across the Midwest
Region. A data file containing 3,126 public school
districts in Illinois, Indiana, Iowa, Michigan,
Minnesota, Ohio, and Wisconsin was generated using
the National Center for Education Statistics
Common Core of Data. The SPSS Complex Samples
add-on module was used to create the sample and
to account for the sample design specifications.
Districts were stratified and selected to represent
the total number of districts in the region.

Two hundred sixteen sample districts were selected
according to locale (urban, suburban, rural),
percentage of students in the district eligible for
free or reduced-price lunch (less than or more
than 40 percent of eligible students), and percentage
of minority student enrollment in the district
(less than or more than 40 percent non-Caucasian
minority students). District size was initially
included in the selection strata, but it was dropped
because it was strongly and positively correlated
with district locale.



Schools with more than 40 percent of students eligi- 
ble for free or reduced-price lunch qualify as “school- 
wide program” schools, which have more flexibility 
in the use of Title I funds and the delivery of services. 
For example, staff members must work together to 
develop curriculum and instruction and are free to 
work with all students in the building. Midwest state 
education staff indicated that the flexibility offered to 
schools through the school-wide program can influ- 
ence local policy decisions, including those related to 
teacher evaluation. The 40 percent point for minority 
enrollment was set after conversations with regional 
educators who felt that 40 percent represents a critical 
mass that begins to significantly affect a district’s
policies and procedures. 
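
As a rough illustration of the stratified design described above, the sketch below groups a sampling frame by locale, free or reduced-price lunch eligibility, and minority enrollment, then draws districts from each stratum. It is a minimal sketch, not the SPSS Complex Samples procedure the study used: the proportional allocation rule, the field names (`locale`, `frl_over_40`, `minority_over_40`), and the function `stratified_sample` are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical field names for the three stratification variables.
STRATA_KEYS = ("locale", "frl_over_40", "minority_over_40")

def stratified_sample(frame, n_total, keys=STRATA_KEYS, seed=0):
    """Draw a stratified sample with proportional allocation (an assumed rule).

    Each sampled district gets a design weight N_h / n_h, the number of
    frame districts it stands for in its stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for district in frame:
        strata[tuple(district[k] for k in keys)].append(district)

    sample = []
    for key in sorted(strata):
        units = strata[key]
        # Proportional share of the sample, but at least one district per stratum.
        n_h = min(len(units), max(1, round(n_total * len(units) / len(frame))))
        for district in rng.sample(units, n_h):
            sample.append({**district, "weight": len(units) / n_h})
    return sample
```

Because of rounding and the one-per-stratum floor, the realized sample size can differ slightly from `n_total`. The design weights sum to the size of the frame, which is what allows sample counts to be scaled up to regional estimates.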

The sample’s demographic characteristics and re-
sponse rates are provided in table A1. See appendix
C for tables comparing the total number of districts 
within each stratum — both across the region and 
within each state — with the sample districts. 

Of the recruited districts, 76 did not participate.
Further follow-up with these districts would be
useful to determine whether nonrespondents
are less likely to have formalized teacher evaluation
policies and practices and thus less likely
to respond because they did not understand the
importance of the study to their situations.



TABLE A1

Sample demographic characteristics and response rate

District characteristics                        Districts   Districts       Response rate
                                                recruited   participating   (percent)

Locale
  Rural                                         104          61             59
  Suburban                                       99          67             68
  Urban                                          13          12             92
  Total                                         216         140             65

Free or reduced-price lunch
  More than 40 percent of students eligible      51          33             65
  Less than 40 percent of students eligible     165         107             65
  Total                                         216         140             65

Minority enrollment
  More than 40 percent non-Caucasian minority    14          12             86
  Less than 40 percent non-Caucasian minority   202         128             63
  Total                                         216         140             65

Source: Authors' analysis based on data described in this appendix.



Data collection 

The Midwest Regional Educational Laboratory, 
with the support of the region’s state education 
agencies, sent letters to the 216 district superin- 
tendents to inform them of the study’s goals and 
invite them to participate. To reduce the burden 
on districts, the study authors downloaded teacher 
evaluation policies from the district web sites 
when possible. If the policies were unavailable, the 
National Opinion Research Center (NORC) used a 
variation of the Tailored Design Method (Dillman, 
2000) to collect the data — first contacting school 
districts with an advance letter, followed by a 
reminder post card and a second reminder letter. 

NORC used FedEx for the second mailing because 
many district officials reported not receiving the 
documents through U.S. Postal Service standard 
mail. The letters described the study goals, listed 
the research questions, and requested district 
policy and procedural documentation for teacher 
evaluations (see box A1 for the list of research
questions). The respondents were informed that
electronic versions were preferred to facilitate 
analyzing the data with the NVivo software. If 
districts were unable or unwilling to provide 
electronic documents, FedEx post-paid envelopes 
were provided for mailing hard copies of the policy 
documents. NORC staff prompted districts that 
did not provide documentation within two weeks 
of the first mailing, in many cases making mul- 
tiple calls. The prompters also e-mailed and faxed 
documentation to districts. The response rate for 
each of the Midwest Region states is tabulated in 
table A2. 



TABLE A2

Sample district participation rates by state

State        Districts   Districts       Response rate
             recruited   participating   (percent)

Iowa          31          24             77
Illinois      58          30             52
Indiana       20          16             80
Michigan      38          26             68
Minnesota     23          13             57
Ohio          16          13             81
Wisconsin     30          18             60
Total        216         140             65

Source: Authors' analysis based on data described in this appendix. 
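
The response rates in table A2 follow directly from the recruited and participating counts. The short sketch below recomputes them as a consistency check; the counts are taken from the table, while the names `TABLE_A2` and `response_rate` are introduced here for illustration only.

```python
# (districts recruited, districts participating) by state, from table A2
TABLE_A2 = {
    "Iowa": (31, 24),
    "Illinois": (58, 30),
    "Indiana": (20, 16),
    "Michigan": (38, 26),
    "Minnesota": (23, 13),
    "Ohio": (16, 13),
    "Wisconsin": (30, 18),
}

def response_rate(recruited, participating):
    """Response rate as a whole-number percentage."""
    return round(100 * participating / recruited)

rates = {state: response_rate(*counts) for state, counts in TABLE_A2.items()}
total_recruited = sum(recruited for recruited, _ in TABLE_A2.values())
total_participating = sum(participating for _, participating in TABLE_A2.values())
overall = response_rate(total_recruited, total_participating)

for state, rate in rates.items():
    print(f"{state}: {rate} percent")
print(f"Total: {total_participating} of {total_recruited} districts, {overall} percent")
```

The recomputed totals (216 recruited, 140 participating, 65 percent overall) match the bottom row of the table.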



BOX A1

Research questions

The research questions were adapted
from the Teacher Evaluation Practices
Survey (TEPS) (Ellett & Garland,
1987; Loup et al., 1996).

Evaluation standards and criteria

• What specific criteria are to be
evaluated?

• What external resources were
used to inform the evaluation? [a]

• What training is required of
evaluators?

• Do evaluation policies differ
by content area and/or special
populations? [a]

Evaluation processes

• How often are evaluations to be
conducted?

• What evaluation tools are to be
used?

• What methods of evaluation are
suggested or required?

• Who has responsibility for conducting
the teacher evaluation?

• When are evaluations
conducted? [a]

• How is the teacher evaluation
policy to be communicated to
teachers?

• What formal grievance procedures
are to be in place for teachers?

Evaluation results

• How are teacher evaluation
results to be used?

• How are teacher evaluation
results to be reported?

Note

a. These questions were added to the TEPS
categories to capture additional details
about evaluation requirements.
Eight districts explicitly declined to participate
(three Illinois districts, two Michigan districts, 
one Minnesota district, and two Wisconsin 
districts). Five of the eight did not give a reason. 
Two indicated that they were understaffed, and 
one that it did not have a teacher evaluation policy 
or documentation available. Sixty-five districts 
simply did not respond to the recruitment letter. 



Data analysis 

Districts submitted formal and informal policy 
documents. Formal documents included official 
district documents, such as contracts, employee 
handbooks, district and school web sites, and 
district evaluation forms. Informal documents in- 
cluded any documents created for this study, such 
as e-mailed and handwritten responses to analysis 
questions. The informal and formal documents 
were coded for this study. Twenty-five districts 
provided both formal and informal documents 
and nine provided only informal documents. 

The data were first coded by research question 
and then by the common and unique policies and 
procedures reported by each district. The codes 
were then grouped according to common themes. 
Summary reports were created for each research 
question, using counts of codes and representative 
quotations from the documents to clarify the find- 
ings and identify areas for further inquiry. 

The data were analyzed using the SPSS Complex Samples
add-on module, which calculates common
statistics, such as the standard errors reported in
this paper, based on the parameters of the plan
used to select the representative sample. Because
different groups had different sampling probabilities
in the sample selection process, sampling weights
were also incorporated into the data analysis by
the Complex Samples add-on. 
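
To make the role of the design parameters concrete, the sketch below computes a proportion and its standard error for a stratified sample using the standard textbook estimator. It is an illustration under stated assumptions, not the Complex Samples algorithm itself: simple random sampling within each stratum is assumed, and the function name `stratified_proportion` and its input layout are hypothetical.

```python
import math

def stratified_proportion(strata):
    """Estimate a population proportion and its standard error.

    `strata` is a list of (N_h, n_h, yes_h) tuples: districts in the
    population stratum, districts sampled from it, and sampled districts
    whose policy addresses the question. Requires n_h >= 2 per stratum.
    """
    population = sum(N_h for N_h, _, _ in strata)
    estimate = 0.0
    variance = 0.0
    for N_h, n_h, yes_h in strata:
        W_h = N_h / population   # stratum share of the population
        p_h = yes_h / n_h        # within-stratum sample proportion
        fpc = 1 - n_h / N_h      # finite population correction
        estimate += W_h * p_h
        variance += W_h ** 2 * fpc * p_h * (1 - p_h) / (n_h - 1)
    return estimate, math.sqrt(variance)
```

Strata that are sampled at a lower rate count for more in the estimate through `W_h`, which is why an unweighted percentage of participating districts need not equal the weighted population estimate.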



Limitations 

Given the lack of information about teacher evaluations
and the nature of the policy documents
collected, this study is mainly descriptive and has
several limitations.

Policy and procedural documents may not tell the 
whole story. All the district documents may not 
have been provided. Thus the study may not be an 
exhaustive profile of the teacher evaluation process 
in all the districts. And policy documents may not 
always reflect actual evaluation practices and pro- 
cesses. One district policy document stated that “a 
teacher’s work will be evaluated at least once every 
two years, and a written report shall be made on 
each teacher by the principal, curriculum coordi- 
nator, or supervisor.” An administrator from this 
district specified, however, that “new teachers are 
evaluated (two) times/year,” while teachers who 
have two years of experience are evaluated “once 
every two years.” Several districts also referenced 
formal documents that were not provided. 

Only district-level policies and procedures were ex- 
amined. Individual schools may have substantial 
autonomy in their evaluation policy and practice 
and may have formal documentation developed 
independent from the district central office. 
School-level policies were not included in this 
study, however. This study also did not look at how 
closely schools adhere to the district evaluation 
policies and procedures. 

Sample sizes in some demographic categories are 
small. District responses are broken down by 
locale (urban, rural, and suburban). In some cases, 
the samples for these demographic categories are 
small and should thus be interpreted with caution. 
Reported categorical differences may not always 
reflect the region as a whole. 

Teacher contracts were rarely included in the 
analyses. Teacher contracts likely contain informa- 
tion about teacher evaluations, but the Midwest 
Regional Educational Laboratory received very few
teacher contracts from participating districts. This 
missing data source makes it likely that the poli- 
cies used to address the research questions do not 
fully represent policies in the Midwest Region. 
APPENDIX B

STUDY FINDINGS BY RESEARCH QUESTION 

This summary of findings includes the percent of 
districts with policy documents addressing the 13 
research questions and identifies patterns in the 
policy prescriptions. Standard errors are included 
for population estimates of districts with policy 
addressing each research question. The standard 
errors represent the variability of the estimates. 

The larger the reported standard error, the less 
precise the estimate. For estimates disaggregated 
by locale, see table 1 in the main body of the report. 
The disaggregated results by minority and free or 
reduced-price lunch are not discussed because dif- 
ferences between the groups showed little variance. 
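
One way to read a reported standard error is to convert it into an approximate interval around the estimate. The sketch below does this for the first research question (39 percent, standard error = 6.1 percent). The report itself does not present confidence intervals; the normal approximation, the 1.96 multiplier, and the function name `approx_interval` are assumptions, and the approximation is likely to be poor for estimates near 0 or 100 percent.

```python
def approx_interval(estimate, standard_error, z=1.96):
    """Approximate confidence interval: estimate +/- z * standard error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# Question 1: 39 percent of districts identified specific criteria (SE = 6.1).
low, high = approx_interval(39, 6.1)
print(f"{low:.1f} to {high:.1f} percent")  # 27.0 to 51.0 percent
```

An interval this wide, roughly 27 to 51 percent, shows why estimates with larger standard errors should be treated as less precise.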

The patterns of detail described in the policy 
descriptions within each research question should 
be interpreted with caution. The comparisons are 
based on small numbers of districts that refer- 
enced a given practice within their policy docu- 
ments. These findings pertain only to the sample 
and not to the region as a whole. 

1. What specific criteria are to be evaluated? 

Fifty-four (39 percent, standard error = 6.1 per- 
cent) districts identified the specific evaluation 
criteria. Forty-nine reported that more than one 
criterion should be used to evaluate teachers, 25 
identified five to six criteria, and 24 fewer than 
four. The following sections group the district 
criteria into five categories used for coding. 

Knowledge and instruction. The category “knowl- 
edge and instruction” refers to teacher content and 
pedagogical knowledge. If a district reported that a 
teacher’s content knowledge or pedagogical strategies
should be evaluated, then the district was counted 
under the “knowledge and instruction” column. One 
rural district measured teachers’ knowledge and 
pedagogical practices by observing and critiquing 
teacher “strategies to deliver instruction that meets 
the multiple learning needs of students.” Of the 54 
districts that detailed evaluation criteria, 81 percent 
included “knowledge and instruction” as a criterion. 



Monitoring. “Monitoring” refers to teachers’ 
monitoring of student learning, assessing student 
performance, and reflecting on their own teach- 
ing performance. Half the districts with policies 
detailing the areas of evaluation included “moni- 
toring.” Twenty-two districts dictated that the 
teacher evaluations should look at whether teachers 
examine their students’ performance through 
measures such as assessment data. Nine districts 
required that teachers be evaluated on two or more 
of the monitoring subcomponents. Six districts 
required evaluations to determine how teachers use 
self-reflection to respond to student needs. Eight 
districts required teachers to provide demonstrated 
knowledge of their students’ background and skills. 

Professional responsibilities. Forty-three of the 
54 districts (80 percent) use the “professional 
responsibilities” criterion, which evaluates teach- 
ers’ professional development, fulfillment of 
responsibilities, and communication with col- 
leagues, students, and families as part of their 
evaluation. Eighteen districts required evaluations 
to contain information on teacher communication 
and feedback, and 15 district policies stated that 
teachers should be evaluated on how they fulfill 
their professional responsibilities (involvement in 
school and district activities, adherence to school 
and district policies, cultivation of professional 
and student relationships). Seven districts required 
teachers to be evaluated on their general “profes- 
sional development.” 

Classroom management. The “classroom manage- 
ment” criterion refers to a teacher’s ability to en- 
gage students and to maintain a positive learning 
environment. Forty-seven districts (87 percent) 
included classroom management as part of their 
evaluation, 27 required a general focus on class- 
room management, and 25 required an evaluation 
of teachers’ abilities to maintain positive learning 
environments (an atmosphere of respect and risk 
taking). Nine district documents dictated that stu- 
dent engagement should be part of the evaluation. 

Planning and preparation. The “planning and 
preparation” criterion refers to teacher use of 
goals, lesson plans, and school resources to
enhance instruction. Thirty-six district policies 
focused on teachers’ planning and preparation. 
Twenty districts used the phrase “planning and 
preparation” as a key component in evaluation, 
and 10 districts required that teachers be evalu- 
ated on use of school resources — such as technol- 
ogy or computers — to enhance student learning. 
Eight districts required an evaluation of teachers’ 
lesson plans, and five required teachers to be 
evaluated on how they include instruction and stu- 
dent achievement goals. Seven districts required 
that teachers be evaluated in two or more areas 
related to “planning and preparation.” 

Other criteria. The evaluation requirements of two 
districts did not fit in the evaluation categories 
and were coded as “other.” One district stated that 
“evaluation criteria are established by the board 
of education and are subject to change,” while the 
other evaluated teachers according to “a break- 
down of the standards to which each teacher is 
held and on which each teacher is evaluated.” 



2. What external resources were 
used to inform evaluations? 

Only 29 districts (21 percent; standard error = 3.1
percent) identified resources, such as evaluation
models and standards frameworks, that informed
their evaluation policies and procedures.

Specific evaluation models and guidance used to 
guide teacher evaluation practice. Of the 29 districts 
that provided documentation about guidance and 
resources used, 10 (34 percent) referenced the use 
of Danielson’s Framework for teacher evaluation, 
while 4 (14 percent) noted the National Board for 
Professional Teaching Standards’ five core proposi- 
tions of accomplished teaching, Schlecty’s (1998) 
Standard Bearer Quality Work Framework, the 
Mayerson Academy’s Essential Principles of High 
Performing Learning and Leading, or Manatt’s
Teacher Performance Evaluation (1988). 

State standards used to guide teacher evaluation
practice. A similar number of district policies
identified state standards as informing teacher
evaluation: two used the Illinois Teaching Standards
and seven the Iowa Teaching Standards.

Other models. Five of the 29 district policies (17 
percent) stated evaluators were provided with 
specific evaluation resources but did not identify 
the resources. 



3. What training is required of evaluators? 

Only 11 districts (8 percent, standard error = 2.1 
percent) had written documentation detailing any 
form of training requirements for evaluators. In 
several cases, districts did not provide any details 
about the training — other than that it was about 
teacher evaluation — or about the length of the 
training. A few districts stated that trainings were 
one or two days long. Documentation in one dis- 
trict suggested that new administrators be paired 
with experienced administrators in the district. 
This program appeared to be a mentoring program 
for new administrators, although follow-up is 
needed to confirm this. The following sections de- 
scribe the training categories identified in district 
policies and procedural documents. 

State certification and licensure requirements. Five 
districts indicated that evaluators must obtain 
state school administration certification and li- 
censure (such as Type 75 administrator’s certifica- 
tion). These districts may thus not include peer or 
community member evaluations as a formal part 
of the evaluation process. Two districts required 
administrators to obtain additional certification in 
teacher evaluation. 

State-sponsored training. Three districts required 
evaluators to be trained by the state department of 
education or an affiliated state-sponsored organi- 
zation. This formal training generally occurred in 
the evaluator’s first year in the position. 

Other training provided. Four districts required 
other types of training. In one district first-year 
administrators were required to participate 
in a mentoring program with an experienced 
administrator in the district. Two districts referred
to “national” training in their documentation, but 
did not specify who provided this training or how 
long it lasted. The fourth district indicated that 
its administrators attend a two-day workshop on 
effective teacher evaluation, which was provided 
by an external education consultant. One of the 
previous four district policies stated that the state 
required administrators in certain districts to take 
state-level training. Further investigation is neces- 
sary to determine whether states actually require 
administrator training for teacher evaluation. 



4. Do evaluation policies differ by content 
area and/or special populations? 

District evaluation procedures were examined to 
determine whether they varied by content area 
(mathematics, language arts) or special popula- 
tions (special education, English language learn- 
ers). Seven of 140 participating districts (5 percent, 
standard error = 4.2 percent) reported having 
evaluation policies or procedures that differ by 
content area or special population. 

Evaluation procedures for special education teach- 
ers. Of the seven districts, six had evaluation pro- 
cedures for teachers who work with students with 
special needs (gifted students, special needs). In 
some cases, evaluators of special education teachers 
were required to use a rubric that varied slightly 
from that used for regular education teachers. One 
district explicitly held special education teachers 
to a different standard: “Teachers who are given 
unusual responsibilities or difficult situations in 
which to teach . . . will not be expected to meet the 
same performance standards as other teachers.” 

Evaluation instrument for nonclassroom teachers. 
One suburban district developed a specific tool for 
a summative evaluation of nonclassroom teachers. 



5. How often are evaluations to be conducted? 

Ninety-four of the participating districts (67 percent,
standard error = 4.3 percent) specified how
often evaluations should occur. Eighty-three
districts further specified the frequency for beginning
or “probationary” teachers (59 percent), 81
for experienced or “tenured” teachers (58 percent),
and 7 for unsatisfactory teachers (5 percent).

Of the 83 districts with policies directing the 
frequency of evaluation for beginning teachers, 59 
specified that beginning teachers must be evalu- 
ated at least two or three times a year. Fourteen 
required at least one evaluation a year for begin- 
ning teachers, while the remaining 10 included 
other evaluation instructions. 

Of the 81 districts with policy addressing the 
frequency of evaluation for experienced teachers, 
42 required that experienced or “tenured” teach- 
ers be observed once every three years. Twenty- 
two required at least one evaluation a year for 
experienced teachers, while six districts required 
more than one a year. Three districts required an 
evaluation only once every four or five years. One 
district policy indicated that a formal evaluation 
be conducted every three years for tenured teach- 
ers, with several informal evaluations occurring 
between cycles. 

Surprisingly, district policies rarely specified 
procedures for unsatisfactory teachers. Only seven 
of the 140 districts participating in the study (5 
percent) contained statements about unsatisfactory 
teachers. Five required that unsatisfactory teach- 
ers be evaluated several times a year. One district 
did not indicate a specific number of evaluations, 
but required that an intensive improvement plan 
be implemented: “If the summative evaluation is 
below district standards, the evaluator shall set 
forth in writing steps that shall be taken to improve 
the performance to meet the district standards.” 



6. What evaluation tools are to be used? 

Policies were analyzed to determine what evalua- 
tion tools districts use in the evaluation process. 

Of the 140 participating districts, 83 (59 percent, 
standard error = 4.5 percent) specified tools for 
schools to use in evaluating teachers. Most tools 
were used for summative evaluations and included 
quantitative rating scales to assess teacher perfor-
mance. In some cases, these quantitative forms in- 
cluded a limited number of open-ended questions, 
allowing the rater to support his or her ratings with 
descriptive information. A few districts also used 
classroom observation tools that allowed evalu- 
ators to describe the classroom and teacher and 
student behaviors. In a few cases policies coupled 
evaluation tools with teacher self-assessment tools. 
Examples of typical summative and formative tools 
are provided in appendixes D and E. 



7. What methods of evaluation are suggested or required?

Of the 140 participating districts, 68 (49 percent,
standard error = 5.5 percent) reported that their
written policy either suggested or required specific
evaluation methods. The three methods specifically
noted were classroom observations, portfolio
assessment, and professional development plans.

Classroom observations. Of the 68 districts, 41 (60
percent) suggested or required formal observations,
including scheduled observations. The definition
of “formal observation” was not always clear in
district documentation, but the language suggested
that it often included a “pre-conference, formal
observation, and post-conference.” Teachers were
aware ahead of time that they would be observed.
An evaluation instrument was often used to document
feedback, which was later shared with the
teacher and placed in a permanent file. Twenty-four
of these 41 districts (35 percent of the total 68
districts), in addition to requiring formal observations,
also suggested or required informal observations,
including unscheduled or unannounced observations.
Districts requiring formal observations also
suggested the use of informal observation. Informal
observations were often referred to as classroom
“walkthroughs” or “visitations.” Furthermore, 21
of the 68 districts (31 percent) articulated specific
evaluation methods for new, nontenured, and
probationary teachers, and 10 (15 percent) had distinct
evaluation methods for tenured teachers.

Portfolio, professional development and growth
plans, and other evaluation methods. Of the 68
districts, 13 (19 percent) required portfolios, 5
(7 percent) required individual professional development
or growth plans, and 5 (7 percent) suggested
or required other evaluation methods such as
daily contact, or “weekly lesson plans reviewed by
principal at least six times a year.” Most professional
development or growth plans were designed “to improve
the quality of instruction through a process of
goal setting and collegial sharing.” Such plans typically
included requirements to align the professional
development plan with district and school goals.



8. Who has responsibility for conducting 
the teacher evaluation? 

Of the 140 districts that provided policy and pro- 
cedural documentation, 57 (41 percent, standard 
error = 5.6 percent) identified who is responsible 
for conducting teacher evaluation. Forty-four 
identified building administrators — principals, 
vice principals, or content specialists — as respon- 
sible. Seven districts (5 percent) had policies that 
directed district administrators and supervisors 
to conduct teacher evaluations. Two suburban dis- 
tricts had rather unique strategies of peer evalu- 
ations under certain circumstances. One allowed 
for “co-teaching” as an evaluation method after 
the teacher completed the first year with a satis- 
factory rating. The other required all nontenured 
teachers to be evaluated at least once a semester by 
a director, but peers could also provide feedback 
on lesson plans, exams, and instruction. 



9. When are evaluations conducted? 

The timely provision of constructive feedback 
is one of the important aspects of evaluation, 
especially for new teachers who may find the first 
year or two of teaching tremendously challenging. 
Fifty-one districts (36 percent, standard error = 5.1 
percent) included language about when to conduct 
evaluations for beginning (or probationary) and 
for career (tenured) teachers. 

District documentation specified the timing of 
evaluations for beginning teachers more often than 
for teachers with more experience. Twenty out of 
the 51 district policies mentioned above contained
language concerning new teachers (probationary, 
nonpermanent). Many rural and urban school dis- 
trict policies specified when beginning teachers are 
to be evaluated — usually in the fall. The remaining 
31 district policies did not have different policies 
for teachers with different levels of experience, but 
included general language such as “teacher evalua- 
tions will take place throughout the school year.” 



10. How is the teacher evaluation policy 
to be communicated to teachers? 

Of 140 districts, 45 (32 percent, standard error = 5.7 
percent) communicated evaluation policy to teach-
ers: 24 percent through teacher contracts, 36 percent
through the teacher handbooks, 36 percent through 
group orientation, and 33 percent through one-on- 
one communication with the supervisor. Most of the 
45 districts included more than one of these methods 
in their policy. The fact that only one-third of partici- 
pating districts included information specifying how 
district evaluation policies should be communicated 
to teachers raises questions about the consistency of 
evaluation practice, both within and across schools, 
and about the criteria used in making decisions about 
professional development, tenure, and termination. 



11. What formal grievance procedures 
are to be in place for teachers? 

Of the 140 participating districts, 44 (31 percent, 
standard error = 4.4 percent) submitted policies 
that included teachers’ rights to file grievances. 
Most districts authorized teachers to provide an 
addendum to the evaluation (82 percent), but some 
gave teachers the right to request another evalu- 
ation (7 percent), to request a different evaluator 
(5 percent), or to allow an arbitrator to review the 
evaluation (5 percent). 



12. How are teacher evaluation results to be used? 

The written evaluation policies of 67 of the 140
participating districts (48 percent, standard
error = 6.6 percent) stipulated how the evaluation
results should be used: to inform personnel
decisions (60 percent), to suggest instructional
improvements (39 percent), to inform professional
development goals (28 percent), and to guide
remediation (12 percent).

To inform personnel decisions. Policies from 40
districts (60 percent) stated that evaluation results
should be incorporated into teacher employment
status and personnel decisions, particularly for
probationary and nontenured teachers.

To suggest improvement. The policies from 26
districts (39 percent) called for evaluations to help
teachers improve their practice, particularly in
identified areas of weakness. Documents from 19
districts (28 percent) stated that the evaluation
results would be used to determine professional
development needs. The resulting professional
development initiatives could be internal or
external to the district. Documentation from eight
districts (12 percent) was coded as “remediation
reevaluation,” which meant that a poor evaluation
resulted in “intensive assistance” for the teacher.

Other uses. Eight district policies (12 percent) 
included only very broad uses for evaluations. One 
district indicated that teacher evaluation results 
were used “to comply with mandates.” Another 
district used results “to improve the educational 
program.” Such all-encompassing uses allow for 
expansive interpretation. 



13. How are teacher evaluation results to be reported? 

Of the 140 participating districts, 43 (31 percent, 
standard error = 6.9 percent) had policies stating 
how evaluation results should be compiled and 
reported. Several reporting methods emerged, in- 
cluding the use of summative evaluation forms (79 
percent), written narratives (30 percent), formative 
evaluation (19 percent), and teacher conferences 
(14 percent). 

Summative evaluation forms. Of the 43 districts that 
specified how to report the evaluation results, 34 
(79 percent) required that a completed summative 
evaluation form be signed by the evaluator and the 
teacher and then stored in the teacher’s personnel
file. In some cases, the superintendent also received 
a copy. The form itself was often the data-gathering 
tool used to assess the teacher’s performance. 

Written narratives. Documentation from 13 
districts (30 percent) explicitly required a writ- 
ten narrative. One policy stated that “narrative 
summaries are to include specific teaching as- 
signments, length/service in the district, dates of 
observations, specific strengths, weaknesses or 
areas cited for improvement needed, improvement 
tasks required, specific contract recommendations 
and the teacher’s signature and the evaluator’s 
signature.” The narratives were often considered 
the summative evaluation report. 



Multiple formative evaluations. Eight policies
(19 percent) used data from formative evaluations
conducted throughout the school year to complete a
summative evaluation form. The formative evaluation
forms were typically required for classroom
observations. In some districts the same set of
behaviors and practices was observed in the formative
and summative evaluations.

Conferences. Documentation from six districts
(14 percent) explicitly stated that the administrator,
supervisor, or evaluator was to hold a conference with
the teacher to discuss summative evaluation findings.
Some districts required that the conference take
place before the summative evaluation report was
filed.
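
The percentages reported under question 13 can be recomputed from the district counts given in the text (34, 13, 8, and 6 of the 43 districts with reporting language). The sketch below is only an illustration of that arithmetic; the variable and function names are hypothetical and do not come from the study.

```python
# District counts per reporting method, as stated in the text for question 13.
# All names here are illustrative; the report does not describe its own tooling.
TOTAL_WITH_REPORTING_POLICY = 43

method_counts = {
    "summative evaluation forms": 34,
    "written narratives": 13,
    "formative evaluations": 8,
    "teacher conferences": 6,
}


def percent_share(count: int, total: int) -> int:
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


shares = {
    method: percent_share(count, TOTAL_WITH_REPORTING_POLICY)
    for method, count in method_counts.items()
}

for method, share in shares.items():
    print(f"{method}: {share} percent")

# The counts add up to more than the 43 districts.
total_mentions = sum(method_counts.values())
print(f"total mentions: {total_mentions}")
```

Because the counts add up to more than 43, the shares exceed 100 percent in total, which suggests that some policies named more than one reporting method.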
APPENDIX C

DESCRIPTION OF TEACHER EVALUATION 
STUDY STRATIFIED RANDOM SAMPLE 



TABLE C1

Number of school districts in the region and in the sample by state

State         Districts in region   Districts in sample
Iowa                          370                    31
Illinois                      887                    58
Indiana                       291                    20
Michigan                      553                    38
Minnesota                     348                    23
Ohio                          240                    16
Wisconsin                     437                    30
Total                       3,126                   216



Source: Authors' analysis based on data described in appendix A. 
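
As a rough illustration of how the sample relates to the region, the share of each state's districts that fall in the sample can be computed from the counts in table C1. This is a sketch only; the names are hypothetical, and the study's actual stratification procedure is the one described in appendix A, not this calculation.

```python
# (districts in region, districts in sample) by state, copied from table C1.
# Names are illustrative and not taken from the report.
table_c1 = {
    "Iowa": (370, 31),
    "Illinois": (887, 58),
    "Indiana": (291, 20),
    "Michigan": (553, 38),
    "Minnesota": (348, 23),
    "Ohio": (240, 16),
    "Wisconsin": (437, 30),
}


def sampling_rate(region: int, sample: int) -> float:
    """Return the sampled districts as a percentage of regional districts."""
    return round(100 * sample / region, 1)


rates = {state: sampling_rate(r, s) for state, (r, s) in table_c1.items()}

region_total = sum(r for r, _ in table_c1.values())
sample_total = sum(s for _, s in table_c1.values())
overall_rate = sampling_rate(region_total, sample_total)

for state, rate in rates.items():
    print(f"{state}: {rate} percent of districts sampled")
print(f"Region: {sample_total} of {region_total} districts ({overall_rate} percent)")
```

The state totals computed this way match the table's total row (3,126 districts in the region and 216 in the sample).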



TABLE C2

Free or reduced-price lunch eligibility by region and by sample

Free or reduced-price lunch eligibility                     Districts in region   Districts in sample
Districts with less than 40 percent of students eligible                  2,390                   166
Districts with more than 40 percent eligible                                736                    50



Source: Authors' analysis based on data described in appendix A. 



TABLE C3

Minority student population by region and by sample

Minority student population                    Districts in region   Districts in sample
Less than 40 percent non-Caucasian minority                  2,873                   202
40-100 percent non-Caucasian minority                          248                    14
Missing                                                          5                     0



Source: Authors' analysis based on data described in appendix A. 
TABLE C4

District locale by state, regional data

                     District locale
State         Urban   Suburban   Rural   Total
Iowa             15         92     262     369
Illinois         34        516     337     887
Indiana          25        143     122     290
Michigan         37        269     247     553
Minnesota        14        116     215     345
Ohio             22        149      69     240
Wisconsin        26        189     222     437
Total           173      1,474   1,474   3,121



Source: Authors' analysis based on data described in appendix A. 



TABLE C5

District locale by state, sample data

                     District locale
State         Urban   Suburban   Rural   Total
Iowa              2          6      23      31
Illinois          2         33      23      58
Indiana           2         10       8      20
Michigan          3         18      17      38
Minnesota         1          8      14      23
Ohio              1         10       5      16
Wisconsin         2         13      15      30
Total            12         98     100     216



Source: Authors' analysis based on data described in appendix A. 



TABLE C6

Free or reduced-price lunch eligibility, regional data

                  Free or reduced-price lunch eligibility
State         Less than 40 percent   More than 40 percent   Total
              of students eligible   of students eligible
Iowa                           310                     60     370
Illinois                       679                    208     887
Indiana                        223                     68     291
Michigan                       362                    191     553
Minnesota                      246                    102     348
Ohio                           184                     56     240
Wisconsin                      386                     51     437
Total                        2,390                    736   3,126



Source: Authors' analysis based on data described in appendix A. 
TABLE C7

Free or reduced-price lunch eligibility, sample data

                  Free or reduced-price lunch eligibility
State         Less than 40 percent   More than 40 percent   Total
              of students eligible   of students eligible
Iowa                            25                      6      31
Illinois                        38                     20      58
Indiana                         13                      7      20
Michigan                        28                     10      38
Minnesota                       19                      4      23
Ohio                            13                      3      16
Wisconsin                       30                      0      30
Total                          162                     48     216


Source: Authors' analysis based on data described in appendix A. 



TABLE C8

Minority student population, regional data

                  Non-Caucasian minority student population
State         Less than 40 percent   More than 40 percent   Total
Iowa                           366                      3     369
Illinois                       758                    129     887
Indiana                        279                     11     290
Michigan                       507                     46     553
Minnesota                      330                     15     345
Ohio                           217                     23     240
Wisconsin                      416                     21     437
Total                        2,873                    248   3,121


Source: Authors' analysis based on data described in appendix A. 



TABLE C9

Minority student population, sample data

                  Non-Caucasian minority student population
State         Less than 40 percent   More than 40 percent   Total
Iowa                            31                      0      31
Illinois                        48                     10      58
Indiana                         20                      0      20
Michigan                        35                      3      38
Minnesota                       23                      0      23
Ohio                            16                      0      16
Wisconsin                       29                      1      30
Total                          196                     14     216



Source: Authors' analysis based on data described in appendix A. 
APPENDIX D

SUMMATIVE TEACHER EVALUATION FORM

Teacher Administrator

Building Subject/Grade Level(s) Date

Dates of Observations:

Date(s) of I.D.P. Planning/Goal Setting

Tenure: Yes No

Probationary: 1st year 2nd year 3rd year 4th year Mentor:

Is a Professional/Individual Development Plan part of this Evaluation? Yes No 

I. Instruction 

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed 



Overall rating S NI UN NA

1. Prepares for assigned classes and responsibilities
(shows evidence of adequate preparation) S NI UN NA

2. Demonstrates clear purpose and objectives S NI UN NA

3. Provides instruction at the appropriate level of difficulty
for each learner S NI UN NA

4. Responds to the efforts of the learners and adjusts instruction to
maximize learning by using a variety of methods and materials S NI UN NA

5. Provides opportunities for active involvement of the learner S NI UN NA

6. Monitors learning interactions and checks learners for understanding S NI UN NA

7. Implements District approved curriculum S NI UN NA

8. Demonstrates competency in subject matter S NI UN NA

9. Appropriately assesses and records learner performance S NI UN NA

10. Demonstrates productive use of time on task S NI UN NA

11. Appropriately utilizes available technological resources S NI UN NA

12. Organizes instruction and monitors achievement toward
mastery learning for all students S NI UN NA

13. Utilizes current research-based instructional strategies to
enhance learning S NI UN NA

14. Monitors and adjusts to accommodate learning styles S NI UN NA



Comments/recommendations on instruction: 



II. Environment 

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed 



Overall rating S NI UN NA

1. Establishes an environment that focuses on student learning S NI UN NA

2. Takes all necessary and reasonable precautions to provide
a healthy and safe environment S NI UN NA

3. Utilizes equipment, materials, and facilities appropriately S NI UN NA

4. Treats individuals within the school community with
dignity and respect S NI UN NA



Comments/recommendations on instruction: 
III. Communication

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed

Overall rating S NI UN NA

1. Demonstrates active listening skills S NI UN NA

2. Establishes and maintains open lines of communication S NI UN NA

3. Demonstrates effective verbal and written communication S NI UN NA



Comments/recommendations on instruction: 



IV. Policy and procedures 

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed 
Overall rating S NI UN NA 

1. Maintains records as required by law, district policy,
and administrative regulations S NI UN NA

2. Attends and participates in district, faculty and
departmental meetings S NI UN NA

3. Abides by school district policies, building procedures,
master agreement and state and federal law S NI UN NA



Comments/recommendations on instruction: 
V. Professionalism

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed

Overall rating S NI UN NA

1. Participates in lifelong learning activities (staff development,
continuing ed., university studies and professional research) S NI UN NA

2. Creates a favorable professional impact by words, action,
appearance and attitudes S NI UN NA

3. Shares general school and district responsibilities S NI UN NA

4. Establishes and maintains professional relations S NI UN NA

5. Contributes to building and district mission and goals S NI UN NA



Comments/recommendations on instruction: 



VI. Reviews of program/teaching goals and/or IDP: Overall rating S NI UN NA 

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed 

Comments/recommendations on instruction: 



Overall rating of the evaluation: S NI UN NA 

S = Satisfactory NI = Needs improvement UN = Unsatisfactory NA = Not applicable/not observed 



Teacher Signature                    Date

Administrator Signature              Date
APPENDIX E

FORMATIVE TEACHER EVALUATION FORM

Class taught                                   Grade

Date                                           Teacher

Pre-observation report                         Post-observation report

1. What topic/unit will be taught? Is this
   new input, practice on objectives,
   review, or a diagnostic lesson?

2. What are the objectives for this lesson?    2. Were the objectives observed during
                                                  the lesson?

3. What procedure will the teacher use to      3. Were the procedures implemented?
   accomplish the objectives?                     Were they effective?

4. What activities will the students be        4. Were the student activities implemented
   doing?                                         as planned? Were they effective?

5. Which particular criterion/criteria do      5. Indicate pertinent data gathered
   you want monitored?                            relevant to the criteria.



A. Demonstrates effective planning skills 

B. Implements the lesson plan to ensure time on task. 

C. Provides positive motivational experiences 

D. Communicates effectively with students. 

E. Provides for effective student evaluation. 

F. Displays knowledge of curriculum and subject matter. 

G. Provides opportunities for individual differences. 

H. Demonstrates skills in classroom management 

I. Sets high standards for student behavior. 

J. Other: 



Form adapted from Manatt (1988). 
NOTES



The Midwest Regional Education Laboratory would 
like to recognize the National Opinion Research 
Center for its contribution to this report. Its sup- 
port in collecting district teacher evaluation poli- 
cies was critical to the completion of this report. 



1. Few teacher contracts were collected. The 
finding that evaluation policy was commu- 
nicated through teacher contracts was based 
on references identified in other district 
documents. 



